
195 research outputs found

On the Generation of Medical Question-Answer Pairs

Author: Du Nan
Fan Wei
Ge Shen
Li Yaliang
Liang Xingzheng
Shen Sheng
Wang Kai
Wu Xian
Xie Yusheng
Yang Tao
Publication date: 06/12/2019

Question answering (QA) has achieved promising progress recently. However, answering a question in real-world scenarios like the medical domain is still challenging, due to the requirement of external knowledge and the insufficient quantity of high-quality training data. In the light of these challenges, we study the task of generating medical QA pairs in this paper. With the insight that each medical question can be considered as a sample from the latent distribution of questions given answers, we propose an automated medical QA pair generation framework, consisting of an unsupervised key phrase detector that explores unstructured material for validity, and a generator that involves a multi-pass decoder to integrate structural knowledge for diversity. A series of experiments have been conducted on a real-world dataset collected from the National Medical Licensing Examination of China. Both automatic evaluation and human annotation demonstrate the effectiveness of the proposed method. Further investigation shows that, by incorporating the generated QA pairs for training, significant improvement in terms of accuracy can be achieved for the examination QA system.
Comment: AAAI 202

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Multiple-Question Multiple-Answer Text-VQA

Author: Appalaraju Srikar
Mahadevan Vijay
Manmatha R.
Tang Peng
Xie Yusheng
Publication date: 14/11/2023

We present Multiple-Question Multiple-Answer (MQMA), a novel approach to text-VQA in encoder-decoder transformer models. The text-VQA task requires a model to answer a question by understanding multi-modal content: text (typically from OCR) and an associated image. To the best of our knowledge, almost all previous approaches for text-VQA process a single question and its associated content to predict a single answer. To answer multiple questions about the same image, these approaches must feed the content into the model repeatedly, once per question. In contrast, our proposed MQMA approach takes multiple questions and content as input at the encoder and predicts multiple answers at the decoder in an auto-regressive manner at the same time. We make several novel architectural modifications to standard encoder-decoder transformers to support MQMA. We also propose a novel MQMA denoising pre-training task which is designed to teach the model to align and delineate multiple questions and content with associated answers. The MQMA pre-trained model achieves state-of-the-art results on multiple text-VQA datasets, each with strong baselines. Specifically, it obtains absolute improvements over the previous state-of-the-art approaches on OCR-VQA (+2.5%), TextVQA (+1.4%), ST-VQA (+0.6%), and DocVQA (+1.1%).

arXiv.org e-Print Archive
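
As a rough illustration of the multiple-question, multiple-answer idea in the abstract above, the sketch below packs several questions and the OCR text into one encoder input string and splits one decoded string back into per-question answers. The function names, the `question N:` / `answer N:` labels, and the `|` separator are all hypothetical choices made for this example; the abstract does not specify the actual input or output format, and no model is involved here.

```python
from typing import List


def pack_questions(questions: List[str], ocr_tokens: List[str]) -> str:
    """Build one input string holding every question plus the shared OCR text.

    The "question N:" labels and the " | " separator are assumptions made
    for this sketch, not the format used in the paper.
    """
    parts = [f"question {i}: {q}" for i, q in enumerate(questions, start=1)]
    parts.append("context: " + " ".join(ocr_tokens))
    return " | ".join(parts)


def unpack_answers(decoded: str, num_questions: int) -> List[str]:
    """Split one decoded string into a list of answers, one slot per question.

    Expects chunks shaped like "answer N: text" separated by "|".
    Questions with no matching chunk keep an empty string.
    """
    answers = [""] * num_questions
    for chunk in decoded.split("|"):
        label, sep, text = chunk.partition(":")
        label = label.strip()
        if not sep or not label.startswith("answer "):
            continue  # skip chunks that do not follow the assumed format
        index_text = label[len("answer "):]
        if index_text.isdigit():
            index = int(index_text) - 1
            if 0 <= index < num_questions:
                answers[index] = text.strip()
    return answers
```

With this layout the shared content appears once in the input regardless of how many questions are asked. The sketch would break on answers that themselves contain `|`, which is one reason a real system would more likely use dedicated separator tokens.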

Contribution of the vertical movement of dissolved organic carbon to carbon allocation in two distinct soil types under Castanopsis fargesii Franch. and C. carlesii (Hemsl.) Hayata forests

Author: Chen Yuehmin
Gao Ren
Si Youtao
Xie Jinsheng
Xiong Li
Yang Yusheng
Zhu Jinmao
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018

Key message: The vertical transport of dissolved organic carbon (DOC) is an important determinant of carbon distribution across a soil profile. The transport of DOC down a soil profile can be largely influenced by incoming DOC and soil organic carbon (SOC) levels, which insulate DOC from adsorption processes regulated by soil texture and Fe/Al mineralogy.
Context: Uncertainties about how soil properties affect DOC transport through the soil profile require study because soils can differ strongly with respect to texture or Fe/Al mineralogy and yet retain similar quantities of DOC.
Aims: This study aimed to assess the role of incoming DOC and native SOC in regulating DOC migration in soils and investigate the contribution of DOC movement to SOC allocation.
Methods: We leached a standard DOC solution extracted from Castanopsis carlesii litter through two distinct soil types, using two leaching strategies: single leaching and sequential leaching. The two soil types under a natural Castanopsis carlesii (Hemsl.) Hayata forest and a natural Castanopsis fargesii Franch. forest, respectively, differ strongly with respect to soil texture, Fe/Al oxide abundances, and SOC nature.
Results: With single leaching, where each of six soil layers making up an entire 0–100-cm soil depth profile received single doses of standard DOC solution, deeper soil layers retained more DOC than upper soil layers, with native SOC largely masking the effects of soil texture and Fe/Al mineralogy on DOC migration. Following sequential leaching, where a sixfold larger amount of standard DOC solution sequentially percolated through the six soil layers, the upper soil layers generally retained more DOC than deeper layers. Nevertheless, in sequential leaching, desorption-induced transfer of carbon from upper soil layers to deeper soil layers resulted in greater total carbon retention than in single leaching.
Conclusion: Forest subsoils (40–100 cm) are well below C saturation, but DOC vertical movement from top soils only transfers limited organic carbon to them. However, DOC vertical movement may greatly alter SOC allocation along the top soil profile (0–40 cm), with part of outer sphere native SOC displaced by incoming DOC and migrating downwards, which is a natural way to preserve SOC.

Towards Differential Relational Privacy and its use in Question Answering

Author: Achille Alessandro
Appalaraju Srikar
Bombari Simone
Mahadevan Vijay
Singh Kunwar Yashraj
Soatto Stefano
Wang Yu-Xiang
Wang Zijian
Xie Yusheng
Publication date: 01/01/2022

Memorization of the relation between entities in a dataset can lead to privacy issues when using a trained model for question answering. We introduce Relational Memorization (RM) to understand, quantify and control this phenomenon. While bounding general memorization can have detrimental effects on the performance of a trained model, bounding RM does not prevent effective learning. The difference is most pronounced when the data distribution is long-tailed, with many queries having only a few training examples: impeding general memorization prevents effective learning, while impeding only relational memorization still allows learning general properties of the underlying concepts. We formalize the notion of Relational Privacy (RP) and, inspired by Differential Privacy (DP), we provide a possible definition of Differential Relational Privacy (DrP). These notions can be used to describe and compute bounds on the amount of RM in a trained model. We illustrate Relational Privacy concepts in experiments with large-scale models for Question Answering.

arXiv.org e-Print Archive

IST Austria: PubRep (Institute of Science and Technology)
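
For context on the Differential Privacy (DP) notion that the abstract above cites as its inspiration, the standard (ε, δ)-DP condition is reproduced below. This is the textbook definition only; the listing does not give the paper's own definition of Differential Relational Privacy, and nothing here should be read as that definition. The symbols M (randomized mechanism), D and D' (neighboring datasets differing in one record), and S (a measurable set of outputs) are the usual notation, not taken from the abstract.

```latex
% Standard (\varepsilon, \delta)-differential privacy (textbook form).
% M : randomized mechanism; D, D' : datasets differing in a single record;
% S : any measurable subset of the output space of M.
\[
  \Pr\bigl[ M(D) \in S \bigr]
  \;\le\;
  e^{\varepsilon} \, \Pr\bigl[ M(D') \in S \bigr] + \delta
  \qquad \text{for all neighboring } D, D' \text{ and all } S .
\]
```

Smaller ε and δ mean that the output distribution changes less when a single record is altered, which is the sense in which DP bounds what a trained model can reveal about any one training example.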